A NOTE ON LAGRANGE MULTIPLIERS

R. C. Kao

A fundamental assumption common to all economic analyses is the
maximization or minimization of some objective function (representing,
say, utility, cost, welfare or whatnot) subject to certain constraints.
Statements of the type: "A consumer with given income maximizes his
total utility only if his marginal utilities for the various
commodities are proportional to their prices," are almost commonplace
in economic texts and are generally described as "equilibrium
conditions" of the optimization process under consideration.
Nevertheless, when these meaningful theorems are presented to even the
more advanced students, the argument is usually shrouded in complete
or partial mystery surrounding the so-called Lagrange multipliers.
Very little explanation is given of these multipliers themselves
except that they are the coefficients used to form a certain
Lagrangian function, the extremization of which leads to the desired
equilibrium conditions.* This note is devoted to an intrinsic (i.e.,
geometric) characterization of these multipliers and a natural
reformulation of the equilibrium conditions that permits a better
insight into the nature of constrained extremum problems in economics.

Let

$$ y = f(x) \tag{1} $$

be a real-valued function of a single variable $x$. The function $f$
may represent, for example, the short-run cost curve of a production
process with only one variable factor. If $f$ is sufficiently smooth
(i.e., $x$ is infinitesimally divisible), a necessary condition for a
(relative) minimum of (1) is, as is well known,

$$ \frac{dy}{dx} = f'(x) = 0, \tag{2} $$

and a sufficient condition for a (relative) minimum of (1) is (2)
plus

$$ \frac{d^2 y}{dx^2} = f''(x) = \frac{d}{dx} f'(x) > 0. \tag{3} $$

Geometrically, (2) states that the tangent vector to the curve $C$
defined by (1) must be horizontal; and (3) states that it is
increasing in slope around any root $x^0$ of (2). (Figure 1.)
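
Conditions (2) and (3) can be checked numerically. The following is a
minimal sketch, assuming a hypothetical curve f(x) = x + 16/x for x > 0
(chosen only for illustration and not taken from the text) and central
finite differences in place of exact derivatives. The names `f`, `d1`,
`d2` and the step sizes are likewise assumptions.

```python
# Numerical check of the first- and second-order conditions (2) and (3).
# The curve below is a hypothetical example, not one given in the text.

def f(x):
    """Hypothetical smooth curve with a single minimum for x > 0."""
    return x + 16.0 / x


def d1(func, x, h=1e-5):
    """Central-difference approximation to the first derivative."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def d2(func, x, h=1e-4):
    """Central-difference approximation to the second derivative."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)


x0 = 4.0                 # root of f'(x) = 1 - 16/x**2 for x > 0
first = d1(f, x0)        # should be close to 0, as (2) requires
second = d2(f, x0)       # should be positive, as (3) requires

print("f'(x0)  ~", first)
print("f''(x0) ~", second)
```

At x0 = 4 the exact values are f'(x0) = 0 and f''(x0) = 32/64 = 0.5, so
the tangent is horizontal and its slope is increasing, in agreement with
(2) and (3).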

Figure 1

A more easily generalizable geometric interpretation of (3) is
the following: Any function $f$ possessing a sufficient number of
derivatives (i.e., sufficiently smooth) may be expanded into a
Taylor's series:**

$$ f(x) = f(x^0) + \frac{df}{dx}(x - x^0) + \frac{1}{2}\frac{d^2 f}{dx^2}(x - x^0)^2 + \cdots + \frac{1}{n!}\frac{d^n f}{dx^n}(x - x^0)^n + \cdots \tag{4} $$

where all derivatives are to be evaluated at $x^0$. (4) states roughly
that the value of $f$ at $x$ may be represented by its value at $x^0$
together with those of all derivatives of $f$ at $x^0$. Consequently,
if $x^0$ is to be a relative minimum, all sufficiently close
neighboring $x$ must not yield a smaller $y = f(x)$, i.e.,

$$ f(x) - f(x^0) \ge 0, \tag{5} $$

or, in terms of (4),

$$ \frac{1}{2}\frac{d^2 f}{dx^2}(x - x^0)^2 > 0 \tag{6} $$

since, at $x^0$, $df/dx = 0$; and if $x$ is sufficiently close to
$x^0$, the term shown in (6) will dominate the combined effect of all
succeeding terms because these remaining terms contain $x - x^0$ to a
higher degree. That (6) is equivalent to (3) is obvious.

* Cf. inter alia the following well-known economic texts: R. G. D.
Allen, Mathematical Analysis for Economists, London: Macmillan, 1949,
pp. 356-367; idem, Mathematical Economics, London: Macmillan, 1956,
pp. 610, [illegible]; D. W. Bushaw and R. W. Clower, Introduction to
Mathematical Economics, Homewood, Illinois: Irwin, 1957, p. 331;
J. M. Henderson and R. E. Quandt, Microeconomic Theory, A Mathematical
Approach, New York: McGraw-Hill, 1958, pp. [illegible]; J. R. Hicks,
Value and Capital, 2nd ed., London: Oxford, 1956, p. 305; P. A.
Samuelson, Foundations of Economic Analysis, Cambridge, Massachusetts:
Harvard, pp. 362-365; and Taro Yamane, Mathematics for Economists,
Englewood, New Jersey: Prentice-Hall, 19[illegible], pp. 116-123.

** See, e.g., R. C. Buck, Advanced Calculus, New York: McGraw-Hill,
1956, pp. 75-77.

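If $f$ is now a function of two independent variables, (1) may
be rewritten as:

$$ y = f(x_1, x_2) \tag{7} $$

and a pair of necessary conditions corresponding to (2) are

$$ \frac{\partial f}{\partial x_1} = f_{x_1} = 0, \qquad \frac{\partial f}{\partial x_2} = f_{x_2} = 0. \tag{8} $$

These conditions state that the tangent vectors to the surface $S$
defined by (7) in the directions of increasing $x_1$ and increasing
$x_2$ must be horizontal, that is, the two tangent vectors must be
parallel to the $x_1 x_2$-plane. (Figure 2.) If $f$ is sufficiently
smooth, its Taylor expansion around any root $x^0$ of (8) is given by

$$ \begin{aligned} f(x_1, x_2) = {} & f(x_1^0, x_2^0) + \frac{\partial f}{\partial x_1}(x_1 - x_1^0) + \frac{\partial f}{\partial x_2}(x_2 - x_2^0) \\ & + \frac{1}{2}\left\{ \frac{\partial^2 f}{\partial x_1^2}(x_1 - x_1^0)^2 + 2\,\frac{\partial^2 f}{\partial x_1 \partial x_2}(x_1 - x_1^0)(x_2 - x_2^0) + \frac{\partial^2 f}{\partial x_2^2}(x_2 - x_2^0)^2 \right\} + \cdots \end{aligned} \tag{9} $$

By an argument similar to that used to derive (6), a sufficient
condition for $x^0$ to be a (relative) minimum is (8) plus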

$$ \frac{1}{2}\left\{ \frac{\partial^2 f}{\partial x_1^2}(x_1 - x_1^0)^2 + 2\,\frac{\partial^2 f}{\partial x_1 \partial x_2}(x_1 - x_1^0)(x_2 - x_2^0) + \frac{\partial^2 f}{\partial x_2^2}(x_2 - x_2^0)^2 \right\} > 0. \tag{10} $$

The tangent vectors $(dx_1, 0) = (x_1 - x_1^0, 0)$ and $(0, dx_2) =
(0, x_2 - x_2^0)$ to the surface (7) at $x^0$ determine a 2-dimensional
tangent plane $dS(x^0)$ to $S$. Since $x^0$ is to be a relative minimum,
all sufficiently close neighboring points must not yield a smaller
$y$, points on the tangents $(dx_1, 0)$, $(0, dx_2)$ being only special
cases. More generally, points on any tangent vector $dx = (dx_1, dx_2)$
to $S$ at $x^0$ which is a linear combination of $(dx_1, 0)$, $(0, dx_2)$
must also not yield a smaller $y$. Since $(dx_1, 0)$, $(0, dx_2)$ span
(i.e., form a basis of) $dS(x^0)$, $dx$ may be represented as

$$ (dx_1, dx_2) = \cos\alpha_1\,(dx_1, 0) + \cos\alpha_2\,(0, dx_2) \tag{11} $$

where $\cos\alpha_1$, $\cos\alpha_2$ are the direction cosines of $dx$
with respect to the local (orthogonal) coordinate system on $dS(x^0)$
with origin at $x^0$ and directions $(dx_1, 0)$, $(0, dx_2)$.
Consequently, a strengthened necessary condition for a relative
minimum at $x^0$, which includes the two conditions in (8), is

$$ \nabla_{dx} f \equiv \frac{\partial f}{\partial x_1}\cos\alpha_1 + \frac{\partial f}{\partial x_2}\cos\alpha_2 = 0 \tag{12} $$

where $\cos\alpha_1$, $\cos\alpha_2$ are the direction cosines of an
arbitrary tangent vector $dx$ to $S$ at $x^0$, and $\partial f/\partial x_1$,
$\partial f/\partial x_2$ are the components of the normal to $dS$ (i.e.,
the gradient vector $\nabla f$ to $S$). $\nabla_{dx} f$, defined to be
the left side of (12), is called the directional derivative of $f$ in
the direction of $dx$. For $\alpha_1 = 0$, $\alpha_2 = \pi/2$,
$dx = (dx_1, 0)$ and (12) yields the first condition in (8); for
$\alpha_1 = \pi/2$, $\alpha_2 = 0$, $dx = (0, dx_2)$ and the second
condition in (8) obtains.
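
The way (12) contains both conditions in (8) can be seen numerically.
The following is a minimal sketch, assuming a hypothetical surface
f(x1, x2) = (x1 - 1)^2 + 2 (x2 + 0.5)^2 + 3 (not taken from the text),
finite-difference partial derivatives, and a unit direction written as
(cos theta, sin theta). The helper names `grad` and
`directional_derivative` are illustrative only.

```python
import math

# Directional derivative (12) as a sum of partial derivatives weighted by
# direction cosines. The surface below is a hypothetical example.

def f(x1, x2):
    return (x1 - 1.0) ** 2 + 2.0 * (x2 + 0.5) ** 2 + 3.0


def grad(func, x1, x2, h=1e-6):
    """Central-difference approximation to (df/dx1, df/dx2)."""
    f1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2.0 * h)
    f2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2.0 * h)
    return f1, f2


def directional_derivative(func, x1, x2, theta):
    """Left side of (12) for the unit direction (cos theta, sin theta).

    Assumption: the direction cosines are cos(alpha1) = cos(theta) and
    cos(alpha2) = sin(theta), which holds for a unit vector in the plane.
    """
    f1, f2 = grad(func, x1, x2)
    return f1 * math.cos(theta) + f2 * math.sin(theta)


angles = [2.0 * math.pi * k / 12 for k in range(12)]

# At the critical point (1, -0.5) the derivative vanishes in every direction.
at_critical = [directional_derivative(f, 1.0, -0.5, t) for t in angles]

# At the noncritical point (2, 0) the gradient is (2, 2): the derivative is
# nonzero along the x1-axis and vanishes only orthogonally to the gradient.
along_axis = directional_derivative(f, 2.0, 0.0, 0.0)
orthogonal = directional_derivative(f, 2.0, 0.0, 3.0 * math.pi / 4.0)

print(max(abs(v) for v in at_critical), along_axis, orthogonal)
```

Setting theta = 0 or theta = pi/2 in `directional_derivative` recovers
the two partial derivatives in (8) one at a time.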

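Moreover, a strengthened sufficient condition analogous to (3)
for a relative minimum at $x^0$ is, by taking again the directional
derivatives of $f_{x_1}$, $f_{x_2}$ in the direction $dx$,

$$ \begin{aligned} \nabla^2_{dx} f & \equiv \nabla_{dx}\left( \frac{\partial f}{\partial x_1}\cos\alpha_1 + \frac{\partial f}{\partial x_2}\cos\alpha_2 \right) = (\nabla_{dx} f_{x_1})\cos\alpha_1 + (\nabla_{dx} f_{x_2})\cos\alpha_2 \\ & = \left( \frac{\partial}{\partial x_1} f_{x_1}\cos\alpha_1 + \frac{\partial}{\partial x_2} f_{x_1}\cos\alpha_2 \right)\cos\alpha_1 + \left( \frac{\partial}{\partial x_1} f_{x_2}\cos\alpha_1 + \frac{\partial}{\partial x_2} f_{x_2}\cos\alpha_2 \right)\cos\alpha_2 \\ & = \frac{\partial^2 f}{\partial x_1^2}\cos^2\alpha_1 + 2\,\frac{\partial^2 f}{\partial x_1 \partial x_2}\cos\alpha_1\cos\alpha_2 + \frac{\partial^2 f}{\partial x_2^2}\cos^2\alpha_2 > 0. \end{aligned} \tag{13} $$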
The tangent plane $dS(\hat{x})$ at any point $\hat{x}$ on $S$ is
defined by the linear terms in the expansion (9), i.e.,

$$ y - f(\hat{x}_1, \hat{x}_2) = \frac{\partial f}{\partial x_1}(x_1 - \hat{x}_1) + \frac{\partial f}{\partial x_2}(x_2 - \hat{x}_2) \tag{14} $$

where $(x_1, x_2, y)$ is now a point in $dS(\hat{x})$, and the partial
derivatives are to be evaluated at $\hat{x}$. To put the matter
differently, if $S$ itself is a plane, then the expansion (9) at any
point on it must be exact up to and including the linear terms, i.e.,
all higher-order terms must vanish identically. The normal to the
tangent plane $dS(\hat{x})$ has components proportional to
$(\partial f/\partial x_1, \partial f/\partial x_2, -1)$.

At a relative minimum point $x^0$ where $S$ and $dS(x^0)$ are tangent
to each other, $y = f(x_1^0, x_2^0)$ and the left side of (14) vanishes.
Consequently, $dS(x^0)$ must be parallel to the $x_1 x_2$-plane (called
the base plane) but at a distance $f(x_1^0, x_2^0)$ from it. For, in
that case the right side of (14) can just as well be written as

$$ \frac{\partial f}{\partial x_1}(x_1 - x_1^0) + \frac{\partial f}{\partial x_2}(x_2 - x_2^0) + (-1)\left[ y - f(x_1^0, x_2^0) \right] = 0 \tag{15} $$

where $y = f(x_1^0, x_2^0)$ identically in $dS(x^0)$, showing the
orthogonality between $(\partial f/\partial x_1, \partial f/\partial x_2, -1)$
and $(x_1 - x_1^0, x_2 - x_2^0, 0)$, a vector in the base plane. However,
the right side of (14) is the same as $\nabla_{dx} f$ defined in (12) if
we choose a point $(x_1, x_2, f(x_1^0, x_2^0))$ in $dS(x^0)$ such that

$$ x_1 - x_1^0 = \cos\alpha_1, \qquad x_2 - x_2^0 = \cos\alpha_2, \qquad y - f(x_1^0, x_2^0) = 0. \tag{16} $$

Since $(\cos\alpha_1, \cos\alpha_2)$ represents a unit vector with
respect to the local coordinate system in $dS(x^0)$, $\nabla_{dx} f$ is
precisely the component (i.e., projection) of $\nabla f$ in the
direction $dx$. (12) states, therefore, that at a critical point $x^0$
on $S$, the projection of the gradient vector $\nabla f$ in every
direction $dx$ in $dS(x^0)$ vanishes, where $dS(x^0)$ is parallel to the
base plane. As $\alpha_1$, $\alpha_2$ in (16) can be arbitrarily chosen,
the right side of (14) can vanish only if $f_{x_1}$, $f_{x_2}$
themselves vanish, justifying (8). Moreover, at a noncritical point
$\hat{x}$ of $S$, a point $(x_1, x_2, y)$ in $dS(\hat{x})$ will
generally have its last component $y$ not identically equal to
$f(\hat{x}_1, \hat{x}_2)$. In fact, these will usually be equal only for
the tangency point between $S$ and $dS(\hat{x})$; therefore, the right
side of (14) does not now vanish always. It may vanish for some
directions $dx$ in $dS(\hat{x})$. These results apply, in general, to
spaces of dimensions greater than 2 also.

On the basis of the above geometric argument, it is now
possible to give an intrinsic characterization of Lagrange
multipliers. Consider, for example, a constrained minimum problem of
the following type: Minimize (7) subject to

$$ g(x_1, x_2) = 0. \tag{17} $$

(17) defines a curve in the base plane, and the minimum of $f$ is to be
sought over all points $x = (x_1, x_2)$ lying on this curve $C$. At any
such (relative) minimum point $x^0$, the directional derivative
$\nabla_{dx} f$ of $f$ along the tangent to $C$ must vanish by (12),
where $\cos\alpha_1$, $\cos\alpha_2$ denote the components of the unit
tangent $dx$ to $C$ at $x^0$. However, (17) shows that

$$ \nabla_{dx} g \equiv \frac{\partial g}{\partial x_1}\cos\alpha_1 + \frac{\partial g}{\partial x_2}\cos\alpha_2 = 0 \tag{18} $$

also at this point. Consequently, $\nabla f$ and $\nabla g$, both being
orthogonal to the same tangent direction $dx$ in the base plane, must
be collinear, i.e., for some scalar $\lambda$

$$ \nabla f = \lambda\,\nabla g. \tag{19} $$

But (19) is equivalent to

$$ f_{x_1} = \lambda\, g_{x_1}, \qquad f_{x_2} = \lambda\, g_{x_2} \tag{20} $$

which are the usual conditions derivable from differentiation of
the Lagrangian function. A sufficient condition for a relative
minimum at $x^0$ is (13) with $\cos\alpha_1$, $\cos\alpha_2$ being again
the components of $dx$ (tangent to $C$ at $x^0$) with respect to the
local coordinate system at $x^0$.
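
Conditions (12), (18) and (20) can be traced on a small example. The
following is a minimal sketch, assuming a hypothetical problem
(minimize f = x1^2 + 2 x2^2 subject to g = x1 + x2 - 3 = 0) that is not
taken from the text; the candidate point (2, 1), the analytic gradients
and all variable names are assumptions made for illustration.

```python
import math

# Hypothetical problem: minimize f = x1^2 + 2 x2^2 subject to
# g = x1 + x2 - 3 = 0. None of these functions comes from the text.

def grad_f(x1, x2):
    return (2.0 * x1, 4.0 * x2)


def grad_g(x1, x2):
    return (1.0, 1.0)


x0 = (2.0, 1.0)                      # candidate point on the curve C
fx = grad_f(*x0)
gx = grad_g(*x0)

# Unit tangent to C at x0: orthogonal to grad g, as (18) requires.
norm = math.hypot(gx[0], gx[1])
tangent = (-gx[1] / norm, gx[0] / norm)

df_along_C = fx[0] * tangent[0] + fx[1] * tangent[1]   # left side of (12)
dg_along_C = gx[0] * tangent[0] + gx[1] * tangent[1]   # left side of (18)

# Multiplier from the first equation of (20); the second must then agree.
lam = fx[0] / gx[0]
second_equation_gap = fx[1] - lam * gx[1]

# Independent check: scan f along C, parametrized by x1 = t, x2 = 3 - t.
grid = [i / 100.0 for i in range(0, 301)]
best_t = min(grid, key=lambda t: t ** 2 + 2.0 * (3.0 - t) ** 2)

print(lam, df_along_C, dg_along_C, second_equation_gap, best_t)
```

Both directional derivatives vanish along the tangent, a single scalar
(here 4) satisfies both equations in (20), and the grid scan along C
locates its minimum at the same point x1 = 2.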

Generalization of the above geometric characterization of
Lagrange multipliers to spaces of higher dimensions is immediate.
Let

$$ y = f(x_1, x_2, \ldots, x_n) \tag{21} $$

again denote the objective function to be extremized, and

$$ g_j(x_1, x_2, \ldots, x_n) = 0 \qquad (j = 1, \ldots, r < n) \tag{22} $$

denote a set of independent side constraints. Each $g_j$ defines a
hypersurface $S_j$ in the base plane [i.e., the $x_1 x_2 \ldots x_n$-plane
in $(n+1)$-dimensional space $E^{n+1}$ with the last axis denoted by
$y$]. The intersection

$$ S_{12\ldots r} = \bigcap_{j=1}^{r} S_j \tag{23} $$

of these hypersurfaces in general yields an $(n-r)$-dimensional
surface in the base plane. At a critical point $x^0 = (x_1^0, \ldots, x_n^0)$
on $S_{12\ldots r}$ (which is $C$ in the preceding example), a tangent
space $dS_{12\ldots r}$ generally exists with basis vectors
$(dx)^1, \ldots, (dx)^{n-r}$, and the directional derivative
$\nabla_{dx} f$ of $f$ along each such basis vector or their linear
combinations must vanish. This says that $\nabla f$ must be orthogonal
to $dS_{12\ldots r}$, or $\nabla f$ lies in $dS^{\perp}_{12\ldots r}$,
the orthogonal complement of $dS_{12\ldots r}$ at $x^0$. But (22) shows
that

$$ \sum_{i=1}^{n} \frac{\partial g_j}{\partial x_i}\, dx_i = 0 \qquad (j = 1, \ldots, r) \tag{24} $$

at $x^0$ also. Hence, if $dx = (dx_1, \ldots, dx_n)$ is chosen to range
over the basis vectors $(dx)^1, \ldots, (dx)^{n-r}$ of $dS_{12\ldots r}$,
(24) merely shows that each $\nabla g_j$ $(j = 1, 2, \ldots, r)$ is also
orthogonal to $dS_{12\ldots r}$. But if $g_j$ $(j = 1, \ldots, r)$ are
independent, $\nabla g_1, \ldots, \nabla g_r$ would form a basis for
$dS^{\perp}_{12\ldots r}$ since

$$ \dim dS_{12\ldots r} + \dim dS^{\perp}_{12\ldots r} = n \tag{25} $$

at a regular point on $S_{12\ldots r}$. Therefore, for some scalars
$\lambda_1, \ldots, \lambda_r$ we must have

$$ \nabla f = \sum_{j=1}^{r} \lambda_j\, \nabla g_j \tag{26} $$


expressing linear dependence of $\nabla f$ on $\nabla g_j$
$(j = 1, \ldots, r)$. (26) gives, in component form,

$$ \frac{\partial f}{\partial x_i} = \sum_{j=1}^{r} \lambda_j\, \frac{\partial g_j}{\partial x_i} \qquad (i = 1, \ldots, n), \tag{27} $$

the more familiar equilibrium conditions. In (27), there are $n$
equations in $n+r$ unknowns $x_1, \ldots, x_n$; $\lambda_1, \ldots, \lambda_r$.
But since $(x_1, \ldots, x_n)$ must also satisfy (22), $r$ additional
equations must be included. Consequently, the Lagrange multipliers are
merely coefficients used in expressing a certain necessary linear
dependence of the gradient vector of the surface defined by $f$ on the
gradient vectors of the surfaces defined by $g_j$ $(j = 1, \ldots, r)$.
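
The linear dependence (26) and its component form (27) can be checked
directly when the critical point is known. The following is a minimal
sketch, assuming a hypothetical problem with n = 3 and r = 2 that is not
taken from the text; the candidate point (2, 1, 0), the use of normal
equations solved by Cramer's rule, and all names are assumptions made
for illustration.

```python
# Hypothetical problem with n = 3 variables and r = 2 constraints:
# minimize f = x1^2 + x2^2 + x3^2 subject to
# g1 = x1 + x2 + x3 - 3 = 0 and g2 = x1 - x3 - 2 = 0.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


x0 = (2.0, 1.0, 0.0)                       # candidate critical point
grad_f = tuple(2.0 * xi for xi in x0)      # (4, 2, 0)
grad_g1 = (1.0, 1.0, 1.0)
grad_g2 = (1.0, 0.0, -1.0)

# Both constraints hold at x0, as (22) requires.
g_values = (x0[0] + x0[1] + x0[2] - 3.0, x0[0] - x0[2] - 2.0)

# Solve the normal equations of (26) for (lambda_1, lambda_2) by Cramer's rule.
a11, a12, a22 = dot(grad_g1, grad_g1), dot(grad_g1, grad_g2), dot(grad_g2, grad_g2)
b1, b2 = dot(grad_f, grad_g1), dot(grad_f, grad_g2)
det = a11 * a22 - a12 * a12                # nonzero when the gradients are independent
lam = ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Residual of (27): zero exactly when grad f lies in the span of the grad g_j.
residual = tuple(grad_f[i] - lam[0] * grad_g1[i] - lam[1] * grad_g2[i]
                 for i in range(3))

# Here n - r = 1, so the tangent space is spanned by a single vector
# orthogonal to both constraint gradients.
tangent = cross(grad_g1, grad_g2)

print(lam, residual, dot(grad_f, tangent))
```

A zero residual says that the n = 3 equations in (27) are satisfied by
r = 2 multipliers, and the vanishing product of the gradient of f with
the tangent vector restates its orthogonality to the tangent space.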

The sufficiency condition is also easily generalized. With
respect to the basis vectors $(dx)^1, \ldots, (dx)^{n-r}$ of
$dS_{12\ldots r}$, a typical unit tangent vector $dx$ in
$dS_{12\ldots r}$ has the form

$$ dx = \sum_{k=1}^{n-r} (dx)^k \cos\alpha_k \tag{28} $$

where $\cos\alpha_1, \ldots, \cos\alpha_{n-r}$ are the direction cosines
of $dx$ with respect to $(dx)^1, \ldots, (dx)^{n-r}$. Then

$$ \nabla^2_{dx} f \equiv \sum_{i,j=1}^{n-r} \frac{\partial^2 f}{\partial x_i \partial x_j}\cos\alpha_i\cos\alpha_j > 0 \tag{29} $$

together with (27) yields a relative constrained minimum at $x^0$.
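Alternatively, if $z = (z_1, \ldots, z_n)$ is any vector in the base
plane, a relative constrained minimum at a point $x^0$ is assured by
(27) and

$$ z'Hz = \sum_{i,j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}\, z_i z_j > 0, \qquad H = \left[ \frac{\partial^2 f}{\partial x_i \partial x_j} \right], \tag{30} $$

for all nonzero $z$ orthogonal to $\nabla g_1, \ldots, \nabla g_r$, that
is, for all $z_1, \ldots, z_n$ satisfying

$$ \sum_{i=1}^{n} \frac{\partial g_j}{\partial x_i}\, z_i = 0 \qquad (j = 1, \ldots, r). \tag{31} $$

That (30) and (31) may be translated into appropriate properties of
the bordered Hessian

$$ \begin{pmatrix} 0 & \cdots & 0 & \dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_1}{\partial x_n} \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & \dfrac{\partial g_r}{\partial x_1} & \cdots & \dfrac{\partial g_r}{\partial x_n} \\ \dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_r}{\partial x_1} & \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & & \vdots & \vdots & & \vdots \\ \dfrac{\partial g_1}{\partial x_n} & \cdots & \dfrac{\partial g_r}{\partial x_n} & \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{pmatrix} $$

may also be readily established.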
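
For the smallest case, n = 2 and r = 1, the link between (30), (31) and
the bordered Hessian can be checked by hand or by machine: expanding the
3 x 3 determinant gives the negative of z'Hz evaluated at
z = (-g_{x_2}, g_{x_1}), the direction satisfying (31). The following is
a minimal sketch, assuming a hypothetical problem (minimize
f = x1^2 - x2^2 subject to g = x2 - 1 = 0) that is not taken from the
text; the constraint is taken to be linear so that the second partials
of f are the only curvature involved, and the helper names are
illustrative only.

```python
# Hypothetical problem with n = 2, r = 1: minimize f = x1^2 - x2^2 subject to
# g = x2 - 1 = 0. The critical point is (0, 1) with multiplier -2.

H = ((2.0, 0.0),
     (0.0, -2.0))          # second partials of f
grad_g = (0.0, 1.0)        # gradient of g


def quad_form(matrix, z):
    """z'Hz as in (30)."""
    return sum(matrix[i][j] * z[i] * z[j] for i in range(2) for j in range(2))


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# A vector satisfying (31): orthogonal to grad g.
z = (-grad_g[1], grad_g[0])
constrained_q = quad_form(H, z)

# A vector violating (31) shows that H is not positive definite by itself.
unconstrained_q = quad_form(H, (0.0, 1.0))

bordered = ((0.0,       grad_g[0], grad_g[1]),
            (grad_g[0], H[0][0],   H[0][1]),
            (grad_g[1], H[1][0],   H[1][1]))
bordered_det = det3(bordered)

print(constrained_q, unconstrained_q, bordered_det)
```

In this example the quadratic form is positive on the direction allowed
by (31) although H itself is indefinite, and the bordered determinant is
negative, which in the n = 2, r = 1 case is the same statement as (30).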


